ABSTRACT



A method, apparatus and computer program product for 
identifying and creating persistent object fragments from a 
named object. For example, a digital content description of 
a named digital object can be dynamically parsed, and 
persistent fragment identities created and maintained to 
facilitate caching. Named digital objects include but are not 
limited to: Web pages described in XML, SGML, and 
HTML. The object description is revised by replacing each 
object fragment with its newly created persistent identity. 
The revised object description is then sent to the requesting 
node. Depending upon the properties of a fragment, this can 
either enable the fragment or the revised object description 
to be cacheable at the server and/or client device. For 
example, the object description can include a dynamic part 
which would otherwise prevent the object from being 
cached. The dynamic part can be recognized and treated as 
a separate fragment from the object description. Thus the 
revised document becomes static and therefore cacheable. 
Furthermore, fragments can be nested. Other features deter- 
mine which part/segment of a named object to recognize as 
a fragment identity, based on its properties including: size; 
processing cost; and static vs. dynamic. Yet other features 
can determine which fragments to cache and replace, for 
example based on the fragment size and processing cost. 
Still other features allow different versions to be generated 
for a fragment upon request. The version created can be 
determined by the property of the requesting devices (e.g., 
handheld device or Internet appliance) and the fragment 
description. 
IDENTIFYING, PROCESSING AND
CACHING OBJECT FRAGMENTS IN A WEB 
ENVIRONMENT 

FIELD OF THE INVENTION

The present invention relates generally to the analysis of
the content of a digital document and in particular to the 
creation and maintenance of persistent fragment identities to 
facilitate caching. 

BACKGROUND 

With the rapid growth of the Internet, the need for efficient 
document exchange becomes increasingly important. In
addition to the Hypertext Markup Language (HTML), the
Extensible Markup Language (XML) is becoming available,
providing a meta-language with which authors can design
their own markup languages.

On the other hand, the proliferation of various non-PC
computing devices, including: handheld devices; palmtop
devices; various other Microsoft WINDOWS CE™-based
devices; set-top boxes; WEB TV; smart phones; and
so-called Internet appliances (hereinafter all referred to as
Internet appliances), further complicates the presentation of
a Web document to a client device. In a Web document based
on HTML, images are treated as separate objects pointed to
by the Web document. A proxy/Web server may generate a
lower resolution version or a black and white version of a
color image to accommodate the limited capability of the
Internet appliance. Nonetheless, these images are named
persistent objects (i.e., they have separate identities which
are their URLs). The proxy or Web server is merely trying
to provide different versions of a named entity based on the
capability of a receiving device. This is independent of any
caching issues at the proxy or Web server to improve object
access time.

Various work exists to provide different versions of a
named object in the Web environment to support Internet
appliance access to the Web. For example, PRISM from
Spyglass (see e.g., http://www.spyglass.com) provides
different versions of images to the Internet appliance. It can
also dynamically translate richly formatted Web documents
into simplified Web pages to accommodate the requirements
of the receiving devices. A means for performing on-demand
data type-specific lossy compression on semantically typed
data and tailoring content to the specific constraints of the
clients is described in "Adapting to Network and Client
Variability via On-Demand Dynamic Distillation," by A.
Fox, et al., Proc. 7th Intl. Conference on Architectural
Support for Programming Languages and Operating
Systems, Oct. 1996.

Using formal descriptors, such as a markup language, to
describe a digital document provides tremendous flexibility.
In the Internet environment, more powerful markup
languages such as XML, or a subset of the Standard
Generalized Markup Language (SGML) (see e.g., ISO 8879/1986;
and Designing XML Internet Applications, by M. Leventhal,
et al., Prentice Hall, 1998), are being defined to augment
HTML. The markup language description can provide rich
information on the document structure and the final
document to be generated. In fact, XML is a language that allows
users to define their own language. For example, chemists
can define a chemical markup language to describe a
molecular structure. Mathematicians or scientists can define
a math markup language to describe complex mathematical
formulas. The interpretation of the markup language
description and generation of the object can thus be
complex. It is desirable to avoid regeneration of the same
description repeatedly. Since Web pages, objects or
documents on a common subject, or from the same company/
division/department or authors often have parts in common,
there is a need to go beyond recognizing just the repeated
references to named entities (i.e., subject already has a
name, e.g., URL) to subparts of named entities.

However, proxy or Web servers and client browsers today 
do not interpret the markup language to decompose a 
document or object into components, provide persistent 
identities and tracking mechanisms to facilitate caching and 
recognition of repeated occurrences of components of a 
named object. They mainly provide caching or processing 
service for named objects as a whole. For example, as 
mentioned previously, in HTML the text documents and 
images (which are separated out from the text documents by 
the authors) are all named objects and hence cacheable
entities. Another problem is that if a document includes
dynamic content, caching is not meaningful, as the next
reference to the same document URL can result in a different 
version of the document. Thus a document is not cached 
even if only a small fraction of its content is dynamic. This 
is an issue for HTML documents today and is expected to 
become more severe for XML documents, which are more 
flexible and make it easier to incorporate various types of 
dynamic information, such as data from a database. 

Thus, the need remains for a system and method for 
identifying and creating one or more persistent object
fragments from a named object, for example to facilitate caching.
The present invention addresses this need. 

SUMMARY 

In accordance with the aforementioned needs, the present 
invention is directed to a method and apparatus for identi- 
fying and creating persistent object fragments from a named 
object. In one example, the present invention is directed to 
a method and apparatus for dynamically parsing a digital 
content description of a named digital object, creating and 
maintaining fragment identities to facilitate caching. 
Examples of named digital objects include but are not 
limited to: Web pages described in XML, SGML, and 
HTML. 

The present invention has features which can parse/analyze
the object description, identify object fragments, create
persistent object fragment identities, revise the object
description by replacing each object fragment with its
newly created persistent identity, and send the revised object
description to the requesting node. Depending upon the
properties of a fragment, this can either enable the fragment
to be cacheable (at the content/proxy server and at the
client device in the Web environment), or make the
revised object description cacheable at the server and client
device. For example, consider the object description of a
purchase order which contains a dynamic part to retrieve the
current price of a product from the database. This dynamic
part may be a small portion of the purchase order, but would
prevent the object from being cached. According to one
feature of the present invention, the dynamic part is
recognized and treated as a separate fragment from the object
description, so that the revised document becomes static and
therefore cacheable. Furthermore, fragments can be nested.
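
The purchase order example can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the claimed implementation: the placeholder form follows the "include" statement described for FIG. 4, and the naming scheme (exclusive-or of the characters, remainder modulo the table size, plus a per-entry counter) follows the example given for FIG. 10. The function names, the table size of 64, the closing `</include>`, tag names written without a space after the colon, and the regular-expression matching (which does not handle a segment nested inside another segment of the same tag) are all hypothetical choices made for the sketch.

```python
import re
from functools import reduce

TABLE_SIZE = 64  # assumed number of fragment description table entries

# entry index -> list of (persistent name, fragment description) pairs
fragment_table = {}


def persistent_name(description):
    """Return the persistent identity for a description, creating it if new."""
    # Exclusive-or of all character codes, reduced modulo the table size.
    index = reduce(lambda acc, ch: acc ^ ord(ch), description, 0) % TABLE_SIZE
    entries = fragment_table.setdefault(index, [])
    for name, known in entries:
        if known == description:
            return name  # same fragment seen before: reuse its identity
    name = f"{index}.{len(entries) + 1}"
    entries.append((name, description))
    return name


def extract_fragments(description, tag):
    """Replace every <tag>...</tag> segment with an include placeholder."""
    pattern = re.compile(
        rf"<{re.escape(tag)}>.*?</{re.escape(tag)}>", re.DOTALL
    )

    def to_include(match):
        return f'<include HREF="{persistent_name(match.group(0))}"></include>'

    return pattern.sub(to_include, description)


# Hypothetical purchase order whose price lookup is the only dynamic part.
order = "<m:order>10 units at <db:price>lookup(item 7)</db:price></m:order>"
revised = extract_fragments(order, "db:price")
print(revised)
```

After the call, the revised description no longer contains the dynamic price segment, only a reference to it, so the revised description itself is static. Running the same description through `extract_fragments` again yields the same identity, which is what makes the identity persistent.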

A method is also provided to determine which part/ 
segment of a named object to recognize as a fragment 
identity, based on its properties, which can include its size, 
processing cost to generate that segment of the object from 
its description, and other properties such as static vs. 
dynamic. 
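
As a worked illustration of such a property-based decision, the sketch below uses the example value function given later in the detailed description: processing cost in seconds divided by the square root of the additional storage requirement in 100 Kbyte increments, compared against an example threshold of 5. Treating a dynamic segment as always eligible reflects the first eligibility criterion. The function names, the use of fractional increments, and the choice of 1 Kbyte = 1024 bytes are assumptions made for the sketch.

```python
import math

THRESHOLD = 5.0          # example threshold from the detailed description
UNIT_BYTES = 100 * 1024  # one 100 Kbyte increment (1 Kbyte = 1024 bytes assumed)


def fragment_value(cpu_seconds, size_bytes):
    """Processing cost divided by the square root of the storage requirement."""
    if size_bytes <= 0:
        raise ValueError("size_bytes must be positive")
    return cpu_seconds / math.sqrt(size_bytes / UNIT_BYTES)


def is_fragment(cpu_seconds, size_bytes, dynamic=False):
    """Decide whether a segment should be recognized as a fragment."""
    if dynamic:
        return True  # separated out so the remaining document stays static
    return fragment_value(cpu_seconds, size_bytes) > THRESHOLD


# The two cases discussed in the detailed description.
case_1 = is_fragment(100, 10 * 1024)   # 100 s of CPU time, 10 Kbytes
case_2 = is_fragment(1, 1000 * 1024)   # 1 s of CPU time, 1000 Kbytes
print(case_1, case_2)  # True False
```

Case 1 gives a value of about 316 and case 2 a value of about 0.32, so only the first segment clears the threshold.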
The present invention has yet other features to determine
which fragments to cache and replace. The cache manager
takes into account the fragment size and processing cost to
generate the fragment.

The present invention has still other features which allow
different versions to be generated for a fragment upon
request. The version created can be determined by the
property of the requesting devices and the fragment
description. Different generators can be maintained for each
type of descriptors or markup tags to generate different
versions for different types of devices.

An example of a method for identifying object fragments
in an object having features of the present invention
comprises the steps of: analyzing an object description to
identify one or more persistent object fragments associated
with the object; creating the one or more persistent object
fragments, in response to said analyzing; and creating a
persistent object fragment identity for a persistent object
fragment, based on one or more of formal descriptors or an
object fragment property. In one embodiment the object
description is revised by replacing at least one object
fragment with an associated persistent object fragment identity
to enable the fragment to be cacheable at one or more of a
server and a client; and the revised object description is sent
to the client. The client receives the revised object
description and processes and/or caches the revised object
description. The client can also receive a version of the one
or more object fragments associated with the fragment identity,
wherein the version is generated at the server and is based
on the capability of the client (e.g., whether it is a handheld
device, a set top box, or an Internet appliance).

BRIEF DESCRIPTION OF THE DRAWINGS

These, and further, objects, advantages, and features of
the invention will be more apparent from the following
detailed description of a preferred embodiment and the
appended drawings wherein:

FIG. 1 is a diagram of an Internet environment having
features of the present invention;

FIG. 2 is a more detailed example of a network
environment having features of the present invention;

FIG. 3 depicts an example of a digital document using a
markup language;

FIG. 4 depicts an example of a modified document;

FIG. 5 depicts the data structure of the fragment
description table;

FIG. 6 is an example of the server logic of FIG. 2;

FIG. 7 is an example of the object request handler;

FIG. 8 is an example of the object parser;

FIG. 9 is an example of the next segment locator;

FIG. 10 is an example of the persistent name creator;

FIG. 11 is an example of the fragment request handler;

FIG. 12 is an example of the fragment cache manager; and

FIG. 13 depicts an example of the client logic.

DETAILED DESCRIPTION

FIG. 1 depicts an example of an Internet environment
adaptable to the present invention. As depicted, a client
(60 ... 63) may be connected through a network (25) to
access proxy servers (30 ... 33) or Web content servers
(40 ... 43). The proxy servers and Web servers can provide
caching of frequently accessed Web objects to improve client
access time. The client may also have its own cache. Those
skilled in the art will realize that a proxy server can be
replaced by a hierarchy of proxy servers. A client node (60)
can also run a proxy server.

FIG. 2 depicts an example of an overall architecture of a
computing node having features of the present invention. In
the example of an Internet or intranet environment, the node
can be [illegible]. As depicted, the computing node can
include: a CPU (250); a scratch pad or main memory (245),
such as [illegible]; and [illegible]. The memory (245) stores
the server logic (240) (with details depicted in FIG. 6),
preferably embodied as computer executable code which may be
loaded from DASD (260) into memory (245) for execution by
CPU (250). The server logic (240) includes an object request
handler (205) (with details depicted in FIG. 7), and a
fragment request handler (210) (with details depicted in
FIG. 11). It also maintains a fragment cache (270), an object
cache (275) and a fragment description table (280) (with
details depicted in FIG. 5). This information can either
reside in persistent storage (260) or in main memory (245).

In a preferred embodiment, an XML-like document will
be used as an example of a document described using some
formal language, such as a markup language. FIG. 3 shows
an example of an XML-like document. The key point here
is that the document includes multiple segments (330),
where each segment (330) is enclosed between a start-tag
(310) and an end-tag (320). For example, <cml:molecule>,
<m:order>, and <db:price> are each a start-tag (310), and
the corresponding end-tags (320) are </cml:molecule>,
</m:order>, and </db:price>, respectively. As depicted, the
segments may be nested. For example, the segment with the
<price> start-tag is included within the segment with the
<m:order> start-tag. Thus parsing the document to recognize
the segments can be done by matching each "end-tag" with
the corresponding "start-tag", which is the first preceding
"start-tag" of the same type at the same nested level. In
markup languages such as XML, each segment can have a DTD
(document type definition) to describe the semantics of the
markup. It is an object of the present invention to select a
subset of the segments contained in a document and recognize
them as persistent object fragments. Fragment creation
eligibility criteria will be introduced next to determine
when an object fragment should be created. In the preferred
embodiment, two sets of creation eligibility criteria are
considered. For each persistent object fragment, a persistent
identity or name is assigned and tracked so that if the
object fragment appears in multiple objects or multiple times
in the same object, it will be recognized as the same
fragment.

The first fragment creation eligibility criterion is to
recognize and separate out a segment as an object fragment
so as to make the remaining document cacheable at the server
or client device and/or processable/interpretable at the
client device. An example is to recognize a dynamic segment
as an object fragment. Consider another example where a
segment cannot be rendered from the markup language
description by a simple client device such as WINDOWS
CE™-based Internet appliances. By recognizing the segment as
a separate object fragment, the client can process and/or
cache the remaining document and let the proxy server
interpret the markup language describing the fragment and
generate an appropriate version for the client. This
limitation on the client devices can be due to a limitation
on the processing power or storage capacity of the client
device to interpret the markup language and generate the
object fragment, a limitation on the bandwidth available to
the client device to retrieve the DTD of the fragment, or
other limitations.
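
The start-tag and end-tag matching described for FIG. 3 can be sketched in Python with a stack of open start-tags, in the spirit of the "tag_stack" used by the object parser and the next segment locator. This is a minimal sketch under stated assumptions rather than the parser of the preferred embodiment: the regular expression, the function name `locate_segments`, and the sample document are hypothetical, tag names are written without a space after the colon, and empty-element tags (such as `<x/>`) are assumed not to occur.

```python
import re

# A start-tag or end-tag; attributes, if any, are skipped over.
TAG = re.compile(r"<(/?)([A-Za-z_][\w:.-]*)[^<>]*>")


def locate_segments(txt):
    """Yield (tag name, segment text, nesting depth), innermost segments first."""
    tag_stack = []  # (name, start position) of start-tags still open
    for match in TAG.finditer(txt):
        is_end, name = match.group(1) == "/", match.group(2)
        if not is_end:
            # Remember where the segment begins (the "token position value").
            tag_stack.append((name, match.start()))
            continue
        if not tag_stack or tag_stack[-1][0] != name:
            raise ValueError(f"unmatched end-tag </{name}>")
        _, start = tag_stack.pop()
        # The segment runs from its start-tag through this end-tag.
        yield name, txt[start:match.end()], len(tag_stack)
    if tag_stack:
        raise ValueError(f"unclosed start-tag <{tag_stack[-1][0]}>")


# Hypothetical document with three segments, one nested inside another.
doc = (
    "<cml:molecule>H2O</cml:molecule>"
    "<m:order>qty 10 <db:price>lookup</db:price></m:order>"
)
for name, segment, depth in locate_segments(doc):
    print(depth, name, segment)
```

Because an end-tag always closes the most recently opened start-tag, a nested segment is reported before the segment that encloses it, which lets a caller evaluate inner segments for eligibility before their parents.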
The second criterion is based on the tradeoffs of
processing and storage or bandwidth requirements to recognize
and separate out a segment as an object fragment so it can be
cached separately and reused to avoid going through
interpreting the markup language description of the object to
generate it again. This will improve response time and
reduce server load on fragment re-references. Each
fragment, once separated out, may need to be requested
separately with additional requests from the client. Thus,
preferably, only a segment or group of segments that meets
a certain threshold on the processing requirements of
interpreting the markup language description to generate the
object segment is recognized as a fragment. Another
consideration is the additional storage requirement to store
the rendered segment. For example, consider two cases. In a
first case, the processing time is 100 seconds of CPU time to
generate the segment from the description, and the size of
the rendered segment is 10 Kbytes. In a second case, the
processing time is 1 second of CPU time to generate the
segment from the description, and the size of the rendered
segment is 1000 Kbytes. In case 1, the savings on CPU time
is substantial while the additional storage cost is minimal.
The opposite is true for the second case. In other words, only
in the first case is it worthwhile to recognize the segment as
a separate fragment for caching. In the preferred
embodiment, for an object O, let P(O) be its processing cost
to generate a segment from its description and S(O) be the
additional storage requirement to store the segment. A value
function, F(P(O), S(O)), based on processing costs and
storage requirements is used to determine the value of
recognizing a fragment. An example of a value function (F)
will be processing cost (in seconds) divided by the square
root of the additional storage requirement (in 100 Kbyte
increments). When the value function exceeds a given
threshold (say 5), the segment will be recognized as a
fragment.

FIG. 3 depicts an example of a document with 3 segments.
As discussed, the first segment (330₁) begins with a
start-tag, <cml:molecule>, and ends with an end-tag,
</cml:molecule>, and the second segment begins with a
start-tag, <m:order>, and ends with an end-tag, </m:order>.
The second segment (330₂) includes a third segment (330₃)
nested within it. The third segment begins with a start-tag,
<db:price>, and finishes with an end-tag, </db:price>.
Assume the semantics of the three segments as follows.
Assume the first segment provides an image of a molecule
structure of a chemical compound. Assume also the second
segment contains a formula to generate an order table
showing the price at different quantities. Assume further, the
third segment retrieves the price information from the
product database. Hence it is a segment with dynamic
information.

FIG. 4 depicts an example of a modified Web document
after the persistent fragments have been recognized and
extracted. Here it is assumed that generating the molecular
structure of the chemical compound in the first segment
(330₁) is quite complex, whereas the computation of the
order table is straightforward. Hence, only the first (330₁')
and the third segments (330₃') are recognized as persistent
fragments with the identities, "125.1" and "28.3",
respectively. In the preferred embodiment, each of the persistent
fragments is replaced with an "include" statement referring
to the name of the fragment, e.g. <include HREF="125.1">,
indicating the reference to the fragment "125.1," and
followed by a </include> statement.

FIG. 5 depicts an example of a fragment description table
for tracking the object fragment identity and its description.
As depicted, the table (505) includes a plurality of entries
(507), where each table entry (507) points to a fragment
description list (510) (only one shown for ease of
description). The list (510) includes one or more description
elements (520 and 525). Each fragment that maps to a given
entry in the fragment description table (505) has a unique
description element (520) on the fragment description list
(510) of the entry. The description element includes several
fields: Nlink (530); Fname (535); and Fdescription (540).
The Fname (535) is the persistent name of the fragment. This
name is given by the persistent name creator routine (with
details depicted in FIG. 10). The Fdescription (540) is the
fragment description. The Nlink (530) points to the next
description element (525) which maps to the same fragment
description table entry (507).

FIG. 6 depicts an example of the server logic (240). In
step 605, the server waits for input. Depending upon the type
of input, the appropriate routine will be invoked. If at step
610, the input is an object request, the object request handler
is invoked, in step 615 (with details described with reference
to FIG. 7). Otherwise, in step 620 it is checked if the input
is a fragment request. For example, in a Web environment,
an object request can be identified on the basis that an object
name will have as the server part of its URL, the name of a
proxy server. If yes, in step 625 the fragment request handler
(with details described with reference to FIG. 11) is invoked.
Otherwise, in step 630 a miscellaneous routine is invoked to
handle other types of input such as FTP requests which are
orthogonal to the current invention and thus will not be
described further.

FIG. 7 depicts an example of the object request handler.
In step 705, it is first checked whether the requested object
is cached in the object cache maintained by this computing
node. If the object is cached, in step 710, the cached object
is returned to the requesting node. Otherwise, in step 715,
the request is forwarded to the content server (or another
proxy server). In step 720, the computing node waits for the
object requested. In step 725, after receiving the object, the
object parser (with details described with reference to FIG.
8) is invoked to analyze the object description and create
fragments. In step 730, the object description, which may
have been modified by the object parser, is sent back to the
requester. In step 735, the object cache manager is invoked
to determine whether the object description (which may
have been modified by the object parser) should be cached
in the object cache. The object cache manager is similar to
a conventional Web cache manager that caches the Web
objects. Any standard cache management policy, such as
LRU (least recently used), or its variants that take into
consideration tradeoffs between object size, update
frequency, and time since last reference (i.e., the reference
frequency) can be used. See for example, C. Aggarwal, et al.,
"On Caching Policies for Web Objects", IBM Research
Report, RC 20619, Mar. 5, 1997, which is hereby
incorporated by reference in its entirety, wherein variants of LRU
caching algorithms on Web objects are described.

FIG. 8 depicts an example of the object parser depicted in
FIG. 7. By way of overview, the object parser maintains two
stacks, a "tag_stack" and a "segment_stack", during its
processing to identify persistent fragments. The tag_stack
includes the "start-tag"s scanned, but whose matching
"end-tag"s have not yet been encountered during scanning of the
object description. The segment_stack includes segments
recognized that are not qualified as fragments, but have the
potential to be combined with segments recognized
subsequently to form a fragment. As depicted, in step 805, the two
stacks are initialized to null. In step 810, a variable, txt, is set
equal to the object description. In step 820, a next segment
locator is invoked (with details described with reference to
FIG. 9) to identify the next segment, Nsegment, in txt. In
step 825, it is checked if Nsegment is null. If so, the
processing of txt is completed. Otherwise, in step 830, it will
delete segments in the segment_stack that are included in
Nsegment, if any. In step 835, it is checked whether
Nsegment satisfies the fragment creation eligibility criterion. If
so, in step 840 a persistent name creator routine (with details
depicted in FIG. 10) is invoked to create a persistent
fragment identity for the segment. In step 845, the txt is
modified to replace the fragment description with an
<include> statement to reference the persistent fragment
name followed by an </include> as described in FIG. 4. In
step 855, the Nsegment is combined with its adjacent peer
segments on the segment_stack, if any, where a peer
segment is a segment at the same level (i.e., with the same
parent) of the Nsegment in a nested markup language
description. In step 860, it is checked if the combined
segment satisfies the fragment creation eligibility criterion.
If so, in step 865, these adjacent peer segments are removed
from the segment_stack. Otherwise, in step 870, the
Nsegment is added to segment_stack.

FIG. 9 depicts a more detailed example of the next
segment locator (FIG. 8, step 820). As depicted, in step 910, it
is checked if the next token is null, where a token is a
consecutive string of characters delimited between blanks
(or some other delimiters defined by the markup language).
If so, in step 915, the Nsegment is set to null. Otherwise, in
step 920, it is checked if the next token is a "start-tag" type
token. If so, the token is inserted into the tag_stack with an
associated "token position value" set to its starting position
in the txt variable. In step 930, it is checked if the next token
is an "end-tag" type token. If so, in step 940, the Nsegment
is set to the substring in txt starting from the token position
value indicated by the top element of the tag_stack to the
"end-tag" token. In step 945, the top element in the
tag_stack is removed.

FIG. 10 depicts a more detailed example of the persistent
name creator (FIG. 8, step 840). As depicted, in step 1005,
the fragment description is obtained from txt. In step 1010,
the fragment description is mapped into a number which
corresponds to an entry of the fragment description table.
Those skilled in the art will appreciate that there are many
alternative mapping functions. For example, this can be
done by performing an exclusive-or of all the characters in
the fragment description and then treating the result as an
integer to divide it by the number of entries in the fragment
description table. The remainder will serve as the index to
the fragment description table. In step 1020, it is checked if
the segment description already appeared in the fragment
description list of the said entry in the fragment description
table. If so, in step 1040, the fragment name of the matching
fragment description will be returned. Otherwise, in step
1025, a new persistent name is created for the fragment.
There are many ways to create a unique name for the
fragment. One way is to maintain a counter for each entry of
the fragment description table to track the number of distinct
fragment descriptions that have been mapped to this entry.
The name given to the new fragment will be the value of its
entry to the fragment description table augmented with the
current value of the counter associated with the said entry.
For example, if a fragment description is mapped to the 26th
entry of the fragment description table and there are already
5 distinct fragments previously mapped to this entry,
the persistent name for the new fragment will be "26.6". In
step 1030, the fragment name and its description are added to
the fragment description list of the corresponding entry in
the fragment description table. In step 1035, the persistent
name created is returned.

FIG. 11 depicts an example of the fragment request
handler (FIG. 6, step 625). As depicted, in step 1105, it is
determined which version of the fragment needs to be
generated and returned to the requesting client, if multiple
versions are available. A degenerate case is that only one
version is available, e.g., a proxy server only has code to
generate one version of a fragment. In step 1110, it is
checked whether the requested version is cached in the
fragment cache. If so, in step 1150, the requested version is
returned to the requesting node. In step 1160, the fragment
cache manager updates the reference statistics. In the
preferred embodiment, an LRU cache management policy is
used where the requested fragment will be moved to the top
of the LRU chain. In step 1120, for the case where the
fragment is not in the fragment cache, it obtains the fragment
description from the fragment description table. In step
1125, the fragment is generated based on the fragment
description and the client requirement. In the preferred
embodiment, each type of markup language describing the
fragment can have its own DTD to provide its semantics. For
each type of DTD, there can be different ways of generating/
rendering the fragment based on the characteristics of the
requesting devices, such as processing power, storage
capacity, and communication bandwidth. This can be
described in a GTD (Generator Table Definition) on how to
generate a different version for a given DTD to satisfy the
requirement of a specific receiving device. The GTD is
separate from the DTD. It can be provided by a third party
such as the Internet appliance manufacturer or other
software manufacturer. In step 1135, the requested fragment
version is returned to the requester. In step 1140, the
fragment cache manager (with details described with
reference to FIG. 12) is invoked.

FIG. 12 depicts an example of the fragment cache
manager. In the preferred embodiment, the fragment cache
manager uses an LRU type replacement policy. As depicted,
in step 1205, it is checked whether there is enough free space
in the fragment cache to cache the requested fragment (O_c).
If so, fragment O_c is cached in the fragment cache.
Otherwise, in step 1215, it determines the minimum k value such
that the bottom k fragments, O_b1, ..., O_bk, in the LRU stack
of the fragment cache will have a total size larger than that of
fragment O_c. In step 1220, it is checked based on the value
function (F) whether it is more desirable to cache O_c or
{O_b1, ..., O_bk}. The total processing cost to generate
{O_b1, ..., O_bk} is the sum of the processing cost of each
O_bi, 1≤i≤k, and the additional storage requirement to store
{O_b1, ..., O_bk} is the sum of the size of each O_bi, 1≤i≤k.
If O_c is more valuable, with a larger F function value, in step
1225, {O_b1, ..., O_bk} is deleted to make room to cache O_c.
In step 1230, the reference statistics for the fragment version
are updated for the fragment cache manager to manage its
LRU cache.

To facilitate garbage collection of fragment descriptions
that are no longer in use, an object-fragment table can be
maintained which tracks the fragments created for each
object, and a fragment-object table to track all objects
containing a common fragment. After an object is updated,
on its next reference, the object parser may detect that the
object now contains some new fragments and some
fragments previously contained in the object are no longer in it.
It will then check for each fragment no longer in use by the
object whether there is any other object containing it based
on the fragment-object table. If so, the fragment description
element in FIG. 5 will be deleted from the fragment
description table. Finally, the object parser will update the
object-fragment table and fragment-object table accordingly. For
each fragment deleted from the fragment description table,
the fragment cache manager will be invoked to check if any
of its fragment versions is in the fragment cache and delete
it.

FIG. 13 depicts an example of the client logic. In step
1305, the client waits for input (request from a user or a
response from the server). Depending upon the type of input,
the appropriate routine will be invoked. If in step 1310, the
input is an object request from the user, the request is sent
to the server in step 1315 (see FIG. 6) where persistent
object fragments in the object are identified and the object
revised as necessary.

In step 1320, if the input is an object (e.g., a server
response from a previous object request), the object is
rendered and displayed to the user in step 1330. Recall that
since persistent object fragments have been recognized to
make the revised object document cacheable at the server or

A preferred embodiment of the present invention includes
features implemented as software tangibly embodied on a
computer program product or program storage device for
execution on a processor (not shown) provided with the
client (60 ...) and server (30 ...). [illegible] implemented
in a popular object-oriented computer executable code such
as JAVA provides portability across different platforms.
Those skilled in the art will appreciate that other
procedure-oriented and object-oriented (OO) programming
environments, including but not limited to C++ and
Smalltalk, can also be employed.

Those skilled in the art will also appreciate that methods
of the present invention may be implemented as software for
execution on a computer or other processor-based device.
The software may be embodied on a magnetic, electrical,
optical, or other persistent program and/or data storage
device, including but not limited to: magnetic disks, DASD,
client device and/or processable/interpre table at the client bubble memory; tape; optical disks such as CD-ROMs; and 
device. Consider the example where a segment can not be 20 other persistent (also called nonvolatile) storage devices 
rendered from the markup language description by a simple suc h as core, ROM, PROM, flash memory, or battery backed 
client device such as WINDOWS CE™-based Internet RAM. Those skilled in the art will appreciate that within the 
appliances. According to the present invention, by recog- spirit md scope of the present invention, one or more of the 
nizing the segment as a separate object fragment the client components instantiated in the memory of the clients (60 . 
can process and/or cache the revised document and allow the 25 . , 63) or scrvcr (30 . 33) could bc acccssed ^ maintained 
server to interpret the markup language describing the ^ yia ^ (26Q) ^ network 2 Qf 
fragment and generate an appropriate version for the client CQuld bc ^ inM across a luralit of servcr5t 
Examples of the limitations on the client device include but 

are not limited to the processing power or storage capacity Now that a preferred embodiment of the present invention 

of the client device to interpret the markup language and 30 has t> ee n described, with alternatives, various modifications 

generate the object fragment; and/or the bandwidth available and improvements will occur to those skill in the art. Thus, 

to the client device to retrieve the description of the frag- the detailed description should be understood as an example 

ment. Recall also that the recognition and revision of an and not as a limitation. The proper scope of the invention is 

object to remove segments qualifying as object fragments defined by the appended claims, 
enable the object fragment to be cached separately and 3s What is claimed is: 

reused to avoid going through interpreting the markup l.Amethod for identifying object fragments in an object, 

language description of the object to generate it again. This sa i d me thod comprising the steps of: 
will improve response time and reduce server load on 

fragment re-references. Each fragment— once removed— analyzmg an object description to identify one or more 
may need to be requested separately with additional requests 40 persistent object fragments associated with the object; 
from the client. Thus, preferably, only a segment or group of creating the one or more persistent object fragments, in 
segments that meet a certain threshold on the processing response to said analyzing; and 

requirements of interpreting the markup language descrip- creating a persistent object fragment identity for a per- 
tion to generate the object segment were recognized as a sistent object fragment, based on one or more of: 

fragment by the server. 45 formal descriptors; and an object fragment property. 

In step 1335, the client determines whether the object is 2. The method of claim 1, wherein the object description 

cacheaole. Recall that any dynamic object or object exceed- is based on the formal descriptors, said method comprising 

ing a certain size will be deemed not cacheable at the client the further steps of: 

device, which often has limited caching capacity. According maintaining and tracking the persistent object fragment 
to the present invention, the server uses persistent object so identity and associated formal descriptors; and gener- 
fragment identifiers to replace persistent object fragments a t mg a cacheable object fragment, 

(such as dynamic objects or large segments) in a Web object. 3. The me thod of claim 1, comprising the further steps of 
The revised object is thus more cacheable at the client reyisin ^ Qbjec( descri tion b laci at leas( one 
device, since the server has removed tiie dynamic or large object fr t ^ m associated object 

objects from the object and reduced the size of the object. 55 fr t idemi tQ enaWe Qne 0f mofe of . ^ ^ 
For example recaU the example of an object description for fr ^ a ^ d ^ descri Uon to be cache . 

a purchase order that includes a dynamic part for retrieving . £ „ t nno nr mrt „ - , „ nn A 

. r _ t r ... rr* . able at one or more 01: a server; and a client; and 

the current price or a product from the database. This , 

dynamic part may be a small portion of the purchase order, f™Jng a ™*d object description to the client. 

but would prevent the object from being cached. According 60 4 ™ c mcthod of claim 3 > whcrcin <*>*prismg the further 

to one feature of the present invention for recognizing and ste P s of: 

treating the dynamic part as a separate fragment from the the client receiving and caching the revised object 
object description, the revised document becomes static and description; and 

therefore cacheable. In step 1340, if the object is cacheable, the client receiving a version of the one or more object 
the object is cached at the local client cache. In step 1325, 65 fragments associated with the fragment identity, 
a miscellaneous routine is invoked to handle other types of wherein the version is generated at the server and is 

input, such as a pager message. based on the capability of the client. 
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5. The method of claim 1, further comprising the steps of: 21. The method of claim 1, wherein said step of creating 
receiving a request for an object fragment; a persistent object fragment further comprises the steps of: 
determining whether the fragment is cached, based on the recognizing and separating a segment as an object frag- 

object fragment identity; and me . nt s ° lt can *? e cachcd separately and reused to avoid 

. - At „ A . . t . . „ s going through interpreting a markup language descrip- 

if the fragment is not cached, dynamically generating the tion of the object to generate it again; wherein the 

fragment segment will only be recognized as the object fragment 

6. The method of claim 1, further comprising the step of only if the segment or group of segments satisfies a 
caching the object fragment based on one or more of: a threshold for interpreting the markup language descrip- 
reference frequency; a cache size; and a processing cost. JQ tion based on one or more of: a processing requirement; 

7. The method of claim 1, further comprising the step of: and a storage requirement. 

generating one or more different versions of the fragment; 22- T* 10 method of claim 1, wherein the persistent object 

wherein a version can be determined by one or more of: fragment will have a consistent identity regardless of 

a requesting device property and the fragment descrip- whether it appears in one or more of: multiple objects; and 

ti on j5 multiple times in the same object. 

8. The method of claim 7, further comprising the step of: 23 - A method for caching objects including object 
determining the version of the persistent fragment based on fragments, said method comprising the steps of: 

the requesting device property and the fragment property. a cue nt receiving from a server an object including a 

9. The method of claim 1, wherein the fragment property revised object description wherein at least one object 
includes a processing requirement. 20 fragment has been replaced with an associated persis- 

10. The method of claim 1, wherein the fragment property tent object fragment identity based on one or more of: 
includes one or more of a storage requirement and a band- formal descriptors; and an object fragment property, in 
width requirement. response to a request for the object; and 

11. The method of claim 1, further comprising the steps the client processing the revised object description. 

of: 25 24- method of claim 23, further comprising the step 

identifying an object fragment as a dynamic object frag- °f ; 

ment; and the client receiving a version of the one or more object 

transforming the dynamic object to a static object by fragments associated with the fragment identity, 

revising the object description and replacing one or wherein the version is generated at the server and is 

more dynamic object fragments with its object identity. 30 based on the capability of the client. 

12. The method of claim 1, wherein the fragment property 2S - ^ method of claim 24, wherein the version is 
includes whether the fragment can be generated efficiently generated at the server and is based on the capability of the 
by various client devices. client. 

13. The method of claim 1, wherein the formal descriptors 26 * ^ method of claim 23, wherein the persistent object 
are markup tags in the object description and wherein the 3S fragment will have a consistent identity regardless of 
object is described using a markup language. whether it appears in one or more of: multiple objects; and 

14. The method of claim 1, wherein the object is a Web multiple times in the same object. 

page described using a markup language selected from the 27 • ^ method of claim 23, wherein the formal descrip- 

group consisting of XML, SGML, or HTML. tors m markup tags in the object description and wherein 

15. The method of claim 1, wherein the object fragment 40 the object is described using a markup language. 

can be nested or hierarchical. 28 - Tne method of claim 23, wherein the object is a Web 

16. The method of claim 15, further comprising the steps P a S e described using a markup language selected from the 
of; group consisting of XML, SGML> or HTML. 

recognizing a nested object fragment as including a . The method of clairr i23, wherein said processing step 

dynamic fragment or a frequently changed fragment; ^ mcl ! ldes one ° r . more of cachm S a revised ob J ect and 

anc j rendering the object. 

, . 4 f , , . . 30. The method of claim 23, further comprising the step 

making an outer fragment cacheable at one or more of a r ,v 1* * . r f f. r t 

server and a client client receiving from the server a version of the object 

1 , Se i! er 3 j C ?\' . + c • • iL A fragment interpret and generated at the server, wherein the 

17. The method of claim 1, further comprising the steps , °- „f A < u . e .! 

o £ * ^ F 50 version generated is based on one or more of: the processing 

power of the client; the storage capacity of the client; and the 

identifying one or more of the object fragments requiring bandwidth available to the client to retrieve a description of 

invalidation; and me f ragment . 

garbage collecting invalid object fragments. 31. The method of claim 23, wherein the persistent object 

18. The method of claim 1, wherein the object fragment ss fragment identifier represents a dynamic object, 
property comprises the property selected from the group 32. The method of claim 23, wherein the client is selected 
consisting of: a dynamic property; a static property; how from a group consisting of: a handheld device; a palmtop 
frequently the object is going to change; size; or processing device; a set-top box; a smart phone; or an Internet appli- 
cost to generate that fragment from its description. ance. 

19. The method of claim 1, further comprising the step of 60 33. A program storage device readable by a machine, 
caching the object based on one or more object fragment tangibly embodying a program of instructions executable by 
properties. the machine to perform method steps for identifying object 

20. The method of claim 1, further comprising the steps fragments in an object, said method steps comprising: 

analyzing an object description to identify one or more 

selecting a subset of the segments contained in the object; 65 persistent object fragments associated with the object; 

^ creating the one or more persistent object fragments, in 

recognizing the subset as persistent object fragments. response to said analyzing; and 



US 6,249,844 B1

creating a persistent object fragment identity for a persistent object fragment, based on one or more of: formal descriptors; and an object fragment property.

34. The program storage device of claim 33, wherein the object description is based on the formal descriptors, said method comprising the further steps of maintaining and tracking the persistent object fragment identity and associated formal descriptors; and generating a cacheable object fragment.

35. The program storage device of claim 33, comprising the further steps of
revising the object description by replacing at least one object fragment with an associated persistent object fragment identity to enable one or more of: the object fragment; and a revised object description to be cacheable at one or more of: a server; and a client; and
sending a revised object description to the client.

36. The program storage device of claim 35, wherein comprising the further steps of:
the client receiving and caching the revised object description; and
the client receiving a version of the one or more object fragments associated with the fragment identity, wherein the version is generated at the server and is based on the capability of the client.

37. The program storage device of claim 33, further comprising the steps of:
receiving a request for an object fragment;
determining whether the fragment is cached, based on the object fragment identity; and
if the fragment is not cached, dynamically generating the fragment.

38. The program storage device of claim 33, further comprising the step of caching the object fragment based on one or more of: a reference frequency; a cache size; and a processing cost.

39. The program storage device of claim 33, further comprising the step of:
generating one or more different versions of the fragment; wherein a version can be determined by one or more of: a requesting device property and the fragment description.

40. The program storage device of claim 39, further comprising the step of: determining the version of the persistent fragment based on the requesting device property and the fragment property.

41. The program storage device of claim 33, wherein the fragment property includes a processing requirement.

42. The program storage device of claim 33, wherein the fragment property includes one or more of a storage requirement and a bandwidth requirement.

43. The program storage device of claim 33, further comprising the steps of:
identifying an object fragment as a dynamic object fragment; and
transforming the dynamic object to a static object by revising the object description and replacing one or more dynamic object fragments with its object identity.

44. The program storage device of claim 33, wherein the fragment property includes whether the fragment can be generated efficiently by various client devices.

45. The program storage device of claim 33, wherein the formal descriptors are markup tags in the object description and wherein the object is described using a markup language.

46. The program storage device of claim 33, wherein the object is a Web page described using a markup language selected from the group consisting of XML, SGML, or HTML.

47. The program storage device of claim 33, wherein the object fragment can be nested or hierarchical.

48. The program storage device of claim 15, further comprising the steps of:
recognizing a nested object fragment as including a dynamic fragment or a frequently changed fragment; and
making an outer fragment cacheable at one or more of a server and a client.

49. The program storage device of claim 33, further comprising the steps of:
identifying one or more of the object fragments requiring invalidation; and
garbage collecting invalid object fragments.

50. The program storage device of claim 33, wherein the object fragment property comprises the property selected from the group consisting of: a dynamic property; a static property; how frequently the object is going to change; size; or processing cost to generate that fragment from its description.

51. The program storage device of claim 33, further comprising the step of caching the object based on one or more object fragment properties.

52. The program storage device of claim 33, further comprising the steps of:
selecting a subset of the segments contained in the object; and
recognizing the subset as persistent object fragments.

53. The program storage device of claim 33, wherein said step of creating a persistent object fragment further comprises the steps of:
recognizing and separating a segment as an object fragment so it can be cached separately and reused to avoid going through interpreting a markup language description of the object to generate it again; wherein the segment will only be recognized as the object fragment only if the segment or group of segments satisfies a threshold for interpreting the markup language description based on one or more of: a processing requirement; and a storage requirement.

54. The program storage device of claim 33, wherein the persistent object fragment will have a consistent identity regardless of whether it appears in one or more of: multiple objects; and multiple times in the same object.

55. A program storage device readable by a machine, tangibly embodying a program of instructions executable by the machine to perform method steps for processing objects including object fragments, said method steps comprising:
a client receiving from a server an object including a revised object description wherein at least one object fragment has been replaced with an associated persistent object fragment identity based on one or more of: formal descriptors; and an object fragment property, in response to a request for the object; and
the client processing the revised object description.

56. The program storage device of claim 55, further comprising the step of:
the client receiving a version of the one or more object fragments associated with the fragment identity, wherein the version is generated at the server and is based on the capability of the client.

57. The program storage device of claim 56, wherein the version is generated at the server and is based on the capability of the client.

58. The program storage device of claim 55, wherein the persistent object fragment will have a consistent identity regardless of whether it appears in one or more of: multiple objects; and multiple times in the same object.

59. The program storage device of claim 55, wherein the formal descriptors are markup tags in the object description and wherein the object is described using a markup language.

60. The program storage device of claim 55, wherein the object is a Web page described using a markup language selected from the group consisting of XML, SGML, or HTML.

61. The program storage device of claim 55, wherein said processing step includes one or more of caching a revised object and rendering the object.

62. The program storage device of claim 55, further comprising the step of: the client receiving from the server a version of the object fragment interpreted and generated at the server, wherein the version generated is based on one or more of: the processing power of the client; the storage capacity of the client; and the bandwidth available to the client to retrieve a description of the fragment.

63. The program storage device of claim 55, wherein the persistent object fragment identifier represents a dynamic object.

64. The program storage device of claim 55, wherein the client is selected from a group consisting of: a handheld device; a palmtop device; a set-top box; a smart phone; or an Internet appliance.

* * * * *
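
By way of illustration only, and not as part of the claims or as the patented implementation, the persistent naming scheme of steps 1025 through 1040 and the garbage collection bookkeeping of the detailed description might be sketched in Python as follows. The class name, the method names, the table size and the checksum used to pick a table entry are all hypothetical choices made for this sketch; the description specifies only that a counter is kept per entry and that an object-fragment table and a fragment-object table are maintained.

```python
from collections import defaultdict


class FragmentDescriptionTable:
    """Hypothetical sketch of the fragment description table and GC tables."""

    def __init__(self, num_entries=64):
        self.num_entries = num_entries          # assumed table size
        self.entries = defaultdict(dict)        # entry -> {description: name}
        self.counters = defaultdict(int)        # entry -> distinct descriptions seen
        self.fragments_of = defaultdict(set)    # object-fragment table
        self.objects_with = defaultdict(set)    # fragment-object table

    def entry_for(self, description):
        # Assumption: a byte checksum selects the entry; entries start at 1.
        return sum(description.encode("utf-8")) % self.num_entries + 1

    def name_for(self, description):
        entry = self.entry_for(description)
        known = self.entries[entry]
        if description in known:
            # Step 1040: a matching description already has a name.
            return known[description]
        # Step 1025: entry value augmented with the entry's counter.
        self.counters[entry] += 1
        name = f"{entry}.{self.counters[entry]}"
        # Step 1030: record the new name with its description.
        known[description] = name
        return name

    def update_object(self, object_id, descriptions):
        """Re-parse an object; return names whose descriptions were deleted."""
        current = {self.name_for(d) for d in descriptions}
        dropped = self.fragments_of[object_id] - current
        self.fragments_of[object_id] = current
        for name in current:
            self.objects_with[name].add(object_id)
        deleted = []
        for name in dropped:
            self.objects_with[name].discard(object_id)
            if not self.objects_with[name]:
                # No other object contains the fragment: discard it.
                del self.objects_with[name]
                self._delete_description(name)
                deleted.append(name)
        return sorted(deleted)

    def _delete_description(self, name):
        entry = int(name.split(".")[0])
        for description, known_name in list(self.entries[entry].items()):
            if known_name == name:
                del self.entries[entry][description]
```

In this sketch the per-entry counter is never decremented when a description is deleted, so a name is not reused for a different fragment. The list returned by `update_object` is where a fragment cache manager could be invoked to drop any cached versions of the deleted fragments.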
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